Informative Plots

Time Series

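If we see a trend, the time series is not stationary: a stationary series has statistical properties (mean, variance, autocorrelation) that do not depend on the time of the observation, and a trend makes the mean change over time.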

Seasonal plot

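If we see seasonality, the time series is not stationary: a stationary series has statistical properties that do not depend on the time of the observation, whereas a seasonal pattern makes the expected value depend on the position within the cycle.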

Elizabeth_II

I don't know why this runs forever without showing the result.

  • time series decomposition
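As a minimal sketch of what a decomposition produces, the example below applies `stats::stl()` to a synthetic daily series with a weekly cycle. The simulated data, the weekly frequency of 7, and the variable names are assumptions made for illustration; they are not the pageview data used in this report.

```r
# Synthetic daily series: linear trend + weekly pattern + noise (illustrative only)
set.seed(1)
n_weeks  <- 52
n        <- 7 * n_weeks
trend    <- seq(100, 200, length.out = n)
seasonal <- rep(c(0, 5, 10, 5, 0, -10, -10), times = n_weeks)

# stats::ts is called explicitly because this report also uses `ts` as a data frame name
x <- stats::ts(trend + seasonal + rnorm(n, sd = 2), frequency = 7)

# Seasonal-trend decomposition using loess, with a fixed (periodic) seasonal pattern
fit <- stl(x, s.window = "periodic")

# Matrix with one column per component: seasonal, trend, remainder
components <- fit$time.series
head(components)
```

Calling `plot(fit)` draws the observed series and the three components in stacked panels. The components add up to the original series, so a visible trend or seasonal panel points to non-stationarity.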

Autocorrelation plot

Autocorrelation quantifies the relationship between lagged values of a time series.
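To illustrate how lagged values relate to each other, the sketch below computes the sample autocorrelation of a simulated series with a weekly pattern using `stats::acf()`. The simulated data and variable names are assumptions for illustration only.

```r
# Simulated series with a repeating 7-day pattern plus noise (illustrative only)
set.seed(42)
n      <- 300
weekly <- rep(c(3, 1, 0, 0, 1, 3, 5), length.out = n) + rnorm(n)

# Sample autocorrelation up to lag 21, without drawing the plot
acf_weekly <- acf(weekly, lag.max = 21, plot = FALSE)

# The first element is lag 0, so lag k is stored at index k + 1
lag0 <- acf_weekly$acf[1]
lag7 <- acf_weekly$acf[8]
c(lag0 = lag0, lag7 = lag7)
```

A weekly pattern shows up as large autocorrelations at lags 7, 14 and 21. Setting `plot = TRUE` draws the usual autocorrelation plot with its confidence bands.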

Modified Series

\[ R_i = \frac{X_i - X_{i-7}}{X_{i-7}} \times 100 \]

\(R_i\) is the modified time series: for each pageview series, it is the relative change, in percent, between the observation at time \(i\) and the observation seven steps earlier, \(X_{i-7}\). It is defined from \(R_8\) onward, because the first seven observations have no counterpart seven steps back.
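The transformation can be sketched in a few lines of base R. The function name `weekly_relative_change` and the example vector are hypothetical and are not taken from the code used in this report.

```r
# Relative percentage change with respect to the observation `lag` steps earlier.
# The first `lag` values are NA because they have no earlier counterpart.
weekly_relative_change <- function(x, lag = 7) {
  n <- length(x)
  r <- rep(NA_real_, n)
  if (n > lag) {
    idx    <- (lag + 1):n
    r[idx] <- (x[idx] - x[idx - lag]) / x[idx - lag] * 100
  }
  r
}

# Made-up daily counts (illustrative only)
views <- c(100, 120, 90, 110, 130, 80, 100, 150, 60)
r     <- weekly_relative_change(views)
r
```

Here `r[8]` compares 150 with 100 and `r[9]` compares 60 with 120. A count of zero seven steps earlier would give an infinite or undefined value, so such observations need separate handling.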

2016_Summer_Olympics

These plots compare the time series before and after the modification.

Peaks-Over-Threshold

## $y
## [1] "Latitude of Seismic Events"
## 
## attr(,"class")
## [1] "labels"

Suitability of POT: Princess Margaret, United Kingdom, United States (?) and Winston Churchill do not seem suitable for POT because they have too few exceedances.
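One simple way to back up such a judgment is to count the exceedances at a few candidate thresholds. The sketch below does this on simulated data. The exponential sample and the candidate quantile levels are assumptions for illustration, not the thresholds used in this report.

```r
# Stand-in for one modified series (illustrative only)
set.seed(7)
x <- rexp(1000)

# Candidate thresholds taken as high empirical quantiles
candidates <- quantile(x, c(0.90, 0.95, 0.99))

# Number of observations strictly above each candidate threshold
n_exceed <- sapply(candidates, function(u) sum(x > u))
n_exceed
```

A threshold that leaves only a handful of exceedances gives a very uncertain generalized Pareto fit, while a lower threshold keeps more data but moves away from the tail. A mean residual life plot, for example `evd::mrlplot()`, is a common complement to these counts.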

library(evd)
## Warning: package 'evd' was built under R version 4.1.2
# data frame holding, for each type, the 99% quantile (row 1)
# and a measure of its uncertainty (row 2)
types <- unique(ts$type)
data99 <- data.frame(matrix(0, nrow = 2, ncol = length(types)))
colnames(data99) <- types

for (i in seq_along(types)) {

  # keep the rows of this type with a non-missing modified count
  ts_type <- ts %>%
    filter(type == types[i], !is.na(`daily count modified`))

  # compute the empirical 99% quantile and save it in the data frame
  quantile99 <- quantile(ts_type$`daily count modified`, 0.99)
  data99[1, i] <- quantile99

  # measure of uncertainty
  # In fpot, mper is a return period counted in periods of npp observations,
  # not a quantile value. With npp = 1 and mper = 100 the model is fitted in
  # its (rlevel, shape) parameterization, where rlevel is the level exceeded
  # on average once every 100 observations, i.e. the 99% quantile.
  # doc of the function here : https://www.rdocumentation.org/packages/evd/versions/2.3-3/topics/fpot
  # thresholds is assumed to hold one threshold per type, in the same order
  uncertainty <- fpot(ts_type$`daily count modified`, threshold = thresholds[i],
                      npp = 1, mper = 100)

  # std.err[1] is the standard error of rlevel (the first parameter), not of shape
  data99[2, i] <- uncertainty$std.err[1]
}
## Warning in fpot.quantile(x = x, threshold = threshold, start = start, npp =
## npp, : optimization may not have succeeded
## Warning in fpot.quantile(x = x, threshold = threshold, start = start, npp =
## npp, : optimization may not have succeeded

## Warning in fpot.quantile(x = x, threshold = threshold, start = start, npp =
## npp, : optimization may not have succeeded

## Warning in fpot.quantile(x = x, threshold = threshold, start = start, npp =
## npp, : optimization may not have succeeded

Detecting Simultaneous High Load

The goal is to detect simultaneous high load across the 12 series provided: which pages seem to have simultaneous high load?
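A simple empirical starting point is to estimate how often one series is above its own high quantile on the days when another series is. The sketch below is an illustration only: the function name `joint_exceedance`, the 95% level, and the simulated series are assumptions, and this is not necessarily the method intended in the course module.

```r
# Share of the days on which `a` exceeds its p-quantile where `b` also
# exceeds its own p-quantile (an empirical conditional exceedance frequency)
joint_exceedance <- function(a, b, p = 0.95) {
  ok <- !is.na(a) & !is.na(b)
  a  <- a[ok]
  b  <- b[ok]
  above_a <- a > quantile(a, p)
  above_b <- b > quantile(b, p)
  sum(above_a & above_b) / sum(above_a)
}

# Simulated series: page_a and page_b share a common shock, page_c is independent
set.seed(3)
shock  <- rexp(2000)
page_a <- shock + rnorm(2000, sd = 0.1)
page_b <- shock + rnorm(2000, sd = 0.1)
page_c <- rexp(2000)

dep   <- joint_exceedance(page_a, page_b)
indep <- joint_exceedance(page_a, page_c)
c(dependent = dep, independent = indep)
```

Values close to 1 suggest that the two pages tend to be under high load on the same days, while values close to `1 - p` are what independence would give. Applying the function to every pair of the 12 series gives a matrix that can be inspected for groups of pages.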

# see module 4

library(extRemes)
## Warning: package 'extRemes' was built under R version 4.1.2
## Loading required package: Lmoments
## Warning: package 'Lmoments' was built under R version 4.1.2
## Loading required package: distillery
## 
## Attaching package: 'extRemes'
## The following objects are masked from 'package:evd':
## 
##     fbvpot, mrlplot
## The following objects are masked from 'package:stats':
## 
##     qqnorm, qqplot
# idea for graphical representation: block maxima by week, colored by type
# https://rdrr.io/cran/extRemes/man/blockmaxxer.html
tsnona <- ts %>% filter(!is.na(`daily count modified`))

# compute block maxima
# note: blocks = tsnona$date makes each date one block, so each maximum is
# taken across types for that day; weekly blocks would need a week index instead
bm <- blockmaxxer(tsnona, blocks = tsnona$date, which = "daily count modified")

# all observations, with the point border colored by type
plot(tsnona$date, tsnona$`daily count modified`, xlab = "Year", ylab = "daily count modified",
     cex = 1.25, cex.lab = 1.25,
     col = factor(tsnona$type), bg = "lightblue", pch = 21)
# overlay the block maxima in dark red
points(bm$date, bm$`daily count modified`, col = "darkred", cex = 1.5)

# numerical method
# GPD model?
# https://rdrr.io/cran/evir/man/gpd.html
library(evir)
## 
## Attaching package: 'evir'
## The following object is masked from 'package:extRemes':
## 
##     decluster
## The following objects are masked from 'package:evd':
## 
##     dgev, dgpd, pgev, pgpd, qgev, qgpd, rgev, rgpd
## The following object is masked from 'package:ggplot2':
## 
##     qplot
modified_NoNA <- modified_ts %>% filter(!is.na(`daily count modified`))

gpd.model <- gpd(modified_NoNA$`daily count modified`, threshold = mean(thresholds), method = "ml")